IT INFRASTRUCTURE LIBRARY (ITIL)
1 Executive Summary
Today's
businesses rely on the many business processes and services supported by IT
infrastructure. It is well known that IT failures often lead to significant adverse
impact on IT services and hence the business. With this in mind, companies are
seeking to align IT with business objectives to ensure that the IT
infrastructure consistently supports the business.
To
help accomplish this goal, ITIL is becoming an increasingly popular standard as
it defines a set of best practices for IT Service Management, thereby ensuring
that business requirements are cost-effectively met.
ITIL
benefits both the customer/user and the IT organization. As IT services become
more clearly focused on business objectives and service agreements, customers
will see an improvement in their business relationships. Further, this
alignment enables IT to better describe services in language the customers can
understand, and improves communication through defined points of contact. ITIL
empowers the IT organization to improve efficiency through standardized
processes, such as easily implementing changes in the infrastructure to
continuously improve IT services. The role of IT becomes more clearly defined
as it is integrated with business objectives and critical decisions are more
easily made. Finally, the quality and cost of the services are better managed,
providing the required service levels at acceptable costs.
The
Concord SPECTRUM Business Unit is dedicated to helping businesses accomplish
their ITIL efforts. We provide a full suite of products to enable effective IT
Service Management. Further, where ITIL processes expand beyond activities that
can be achieved by technology alone, our business processes help organizations
implement these ITIL best practices. This paper gives a high-level overview of
the ITIL processes, and shows how SPECTRUM maps to these processes to support
your ITIL efforts.
2 Introduction to ITIL
ITIL
is the IT Infrastructure Library which was originally defined by the Central
Computer and Telecommunications Agency of the UK government (CCTA) in the
1980s. The CCTA has become the Office of Government Commerce (OGC) and now owns
ITIL. The Information Technology Service Management Forum (itSMF) is an
international, independent user group that has become a major influence on the
best practices of IT Service Management and has embraced ITIL to do so. itSMF
also continues to contribute to ITIL.
2.1.1 Overview of the ITIL Processes
ITIL
provides detailed process definitions for many IT functions that can be adapted
to any IT organization. Actually, most processes defined in detail by ITIL are
already partially implemented in most IT organizations. The main focus of ITIL
processes is on IT Service Management. ITIL consists of a set of 11 Processes
and 1 Function all working together to deliver effective IT Service Management.
The sections defined by ITIL are:
·
Service Desk. This is the function described
in ITIL and is the initial point of contact between the IT organization and
users. It is responsible for many ITIL processes.
·
Incident Management is the process focusing on
solving incidents and restoring services quickly.
·
Problem Management is the process focusing on
solving root cause problems to prevent future incidents.
·
Configuration Management is the process that
keeps all required information about services, service components,
relationships, and other items accurate and up to date.
·
Change Management is the process that controls
the implementation of changes in the infrastructure.
·
Release Management is the process that controls
the rollout of new releases in the infrastructure.
·
Service Level Management
is the process that defines and implements clear agreements for service
delivery between the customers and the IT organization.
·
Financial Management for IT Services is the process that ensures the sensible
management, maintenance, and operation of IT Services from a financial standpoint.
·
Capacity Management is the process that optimally manages capacity to meet the
service requirements at an acceptable cost.
·
Availability Management is the process that
ensures the availability of IT resources and hence the availability of IT Services
to meet agreed-upon service levels.
·
IT Service Continuity Management is the
process that focuses on defining and maintaining appropriate disaster recovery
plans for IT Services.
·
Security Management is the process that ensures the
proper access to services as defined by the service agreements and prevents
unauthorized use.
It
is important to note that all these processes are heavily interrelated and
dependent upon one another. There are typically blurry lines where one process
ends and another starts. A single example of this interdependency is that
Incident Management, Configuration Management, Problem Management, Release
Management, Service Level Management, Availability Management, Capacity
Management, and IT Service Continuity Management are all directly linked to
Change Management.
2.1.2 Benefits of ITIL
There
are many benefits associated with the successful implementation of ITIL. ITIL
best practices allow IT organizations to deliver the optimal service levels to
their customers based on balancing the performance and cost of the services
with the business requirements. Relationships between the provider (internal or
external) and customer are also improved through added customer focus and
through SLAs allowing both parties to have a mutual understanding of the
requirements and the delivery. Additionally, implementing ITIL best practices
makes IT Service Management more efficient, again by focusing on delivering the
required business services at the agreed upon service levels and by having well
defined processes for performing the required management tasks. Using the well
defined processes and best practices helps eliminate problems while increasing
service levels.
2.1.3 ITIL Adoption
ITIL
is most prevalent in Europe as this is where it was initially developed.
However, ITIL implementation is quickly gaining momentum worldwide as
businesses are moving toward IT Service Management and looking for industry
best practices to help them be effective and provide the optimal service
delivery. ITIL implementation is not limited to any vertical industry; it is
being embraced by service providers, enterprises, and many government and
military organizations. ITIL is particularly prevalent in businesses that are
signing up to Service Level Agreements (SLAs) whether these are for internal
service delivery within an organization or between customers and external
service providers.
We
are dedicated to helping with the adoption of ITIL best practices through an
extensive product suite enabling effective service management and through
business practices that support a customer’s ITIL processes.
3 Concord’s Support for ITIL in IT Service Management
This section
outlines each process and how the Concord SPECTRUM Business Unit helps support
it in an IT organization.
3.1 Service Desk
The service desk is actually a function, not a process. It acts as the single point
of contact for all users and can perform multiple ITIL processes. The goal is to
support the services that are to be delivered to the users and ensure timely
response to issues. As the first line of support, the service desk reduces the
workload on the second, third, and nth levels of support by taking care of the simple
calls, and provides better response to users since they have only one contact
for reporting and following up on issues. The Service Desk primarily handles the
Incident Management process; however, it also plays a role in Release Management
and Change Management.
Responsibilities
of the service desk include:
·
Responding to users’ calls, including logging
and tracking all calls. The two types of calls are Incidents and Standard requests.
Incidents include errors, which are actual faults in the IT Services, and Service
Requests, which are requests for information regarding the service such as how
to perform a task, requests to reset passwords, requests for replacement of
non-tracked equipment such as power cords, etc. Standard requests include
changes such as PC upgrades, network connection changes, account setup, etc.
Responding to users’ calls also includes resolving any incidents that have
standard solutions and sharing information with other ITIL processes where
appropriate.
·
Providing proactive information such as
current or expected errors and information regarding services and SLAs to
customers and users.
·
Communicating with suppliers for replacement of hardware
and software.
·
Performing operational tasks such as backups,
account creation, password creation and resetting, new network connections,
etc.
·
Monitoring the infrastructure for faults and root causes of faults and
automatically notifying Incident Management.
The SPECTRUM product suite supports the service desk in many ways. SPECTRUM
monitors the IT infrastructure and uses root cause analysis to quickly pinpoint
faults. Using the OneClick Console and Service Dashboard, service desk
personnel have the information at their fingertips not only to answer calls to
the service desk quickly with the proper information, but also to act proactively
to communicate problems and help fix them before users call. This is expedited
through SPECTRUM’s probable cause and recommended actions files, which help
service desk personnel understand solutions to the problems. The Remedy Gateway
and other trouble ticket integrations automatically create tickets for issues
found by SPECTRUM and also allow bi-directional functionality to automatically clear
alarms in SPECTRUM as tickets are closed or automatically update tickets as
alarm status changes. This streamlines the operation by eliminating much of the
manual effort required of the service desk. Further, SPECTRUM Alarm Notification
Manager (SANM) and Attention provide automated alarm escalation and
notification so the appropriate people are aware of the issues and can address
them efficiently, again reducing the manual workload of the service desk.
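The bi-directional behavior described above can be pictured with a small in-memory model. The sketch below is illustrative only: the class and method names (TicketSync, on_alarm_raised, and so on) are hypothetical and do not represent the Remedy Gateway or any SPECTRUM interface. It only shows the three flows named in the text: an alarm opens a ticket, an alarm change updates the ticket, and closing the ticket clears the alarm.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Alarm:
    alarm_id: str
    severity: str            # e.g. "critical", "major", "minor"
    cleared: bool = False


@dataclass
class Ticket:
    ticket_id: str
    alarm_id: str
    severity: str
    status: str = "open"     # "open" or "closed"


class TicketSync:
    """Hypothetical stand-in for an alarm/ticket integration."""

    def __init__(self) -> None:
        self.alarms: Dict[str, Alarm] = {}
        self.tickets: Dict[str, Ticket] = {}
        self._next_id = 1

    def on_alarm_raised(self, alarm: Alarm) -> Ticket:
        # Alarm -> ticket: every new alarm opens a ticket automatically.
        self.alarms[alarm.alarm_id] = alarm
        ticket = Ticket(f"TT-{self._next_id}", alarm.alarm_id, alarm.severity)
        self._next_id += 1
        self.tickets[ticket.ticket_id] = ticket
        return ticket

    def on_alarm_changed(self, alarm_id: str, severity: str) -> None:
        # Alarm -> ticket: a status change updates the matching open tickets.
        self.alarms[alarm_id].severity = severity
        for ticket in self.tickets.values():
            if ticket.alarm_id == alarm_id and ticket.status == "open":
                ticket.severity = severity

    def on_ticket_closed(self, ticket_id: str) -> None:
        # Ticket -> alarm: closing the ticket clears the alarm.
        ticket = self.tickets[ticket_id]
        ticket.status = "closed"
        self.alarms[ticket.alarm_id].cleared = True
```

In a real deployment each of these handlers would be driven by events from the management system and the ticketing system rather than by direct method calls.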
3.2 Incident Management
As
defined by ITIL, “an incident is any event which is not part of the standard operation
of a service and causes, or may cause, an interruption to, or a reduction in
the quality of that service.” This differs from a problem in that an “incident”
has a known workaround or fix while a “problem” is the underlying root cause of
an incident that when fixed, will prevent future incidents. Incidents include
service requests to do things such as reset passwords or provide documentation
in addition to errors in the infrastructure.
The
goal of Incident Management is to restore services to guaranteed levels quickly
and efficiently to limit the impact of incidents on business processes.
Further, Incident Management maintains logs of incidents and their solutions to
make it easier to solve similar incidents in the future and make this
information available to other processes.
Responsibilities
of Incident Management include:
·
Accepting and recording incidents whether they
are found by a user, management system, or IT personnel. Each incident is
recorded with a tracking number, basic yet detailed information about the
incident (such as time, affected users, affected service, etc.), and
supplemental information from other sources. Incident Management also alerts
other impacted users for high impact incidents.
·
Classifying incidents with a category (type of
incident), priority based on urgency and impact, related services, time to
repair requirements of the SLA, and incident identification.
·
Matching new and previous incidents, looking
for a workaround used for a similar incident in the past.
·
Investigating and
diagnosing incidents to find a solution.
·
Resolving and recovering incidents, including
submitting Requests for Change (RFCs) to the Change Management process.
·
Monitoring and tracking incidents to keep users and other ITIL processes up to
date with any changes.
·
Closing incidents by verifying the solution with the
person who reported the incident.
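As an illustration of the classification step in the list above, priority can be derived from impact and urgency with a simple matrix. The sketch below assumes three levels for each input and a 1 to 5 priority scale; the scale and the formula are assumptions chosen for the example, not values taken from ITIL or from any product.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def priority(impact: Level, urgency: Level) -> int:
    """Return a priority from 1 (handle first) to 5 (handle last).

    Assumed rule: the higher the combined impact and urgency, the
    lower (more pressing) the priority number.
    """
    return 7 - (impact + urgency)
```

With this rule a high-impact, high-urgency incident gets priority 1, while a low-impact, low-urgency incident gets priority 5. An organization would normally replace the formula with a table agreed in its SLAs.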
In
support of this process, the SPECTRUM integrated application suite detects
incidents automatically and finds the root cause often before users report it
to the Service Desk. OneClick Console and the Service Dashboard put this
information at the fingertips of the Incident Management staff enabling a quick
turnaround, thus limiting the performance and availability impact on the
service. The information provided includes the details required to classify the
incident including the priority based on the impact to business services and
customers.
To
aid in accepting and recording incidents, trouble ticket software integrations simplify
the ticketing process by automatically creating tickets based on incidents
found by SPECTRUM Service Manager. The integration with Remedy also allows
automatic ticket closing when incidents have been resolved.
SPECTRUM
Alarm Notification Manager and an integration with Attention software also help
automate escalations ensuring the appropriate support personnel are notified of
the incidents in a timely manner, allowing faster resolution. Once notified, the
support groups can take advantage of SPECTRUM Report Manager and the OneClick
console to easily study current incidents, match them to previous incidents,
and find solutions.
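Matching a current incident to previous ones can be approximated with a text similarity measure. The sketch below uses Jaccard similarity over the words of the incident summaries; this is a generic technique chosen for illustration, not the method used by any SPECTRUM product, and the function names and the 0.3 cut-off are assumptions.

```python
import re
from typing import List, Optional, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lower-cased alphanumeric words of a summary."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared words divided by all distinct words."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(
    new_summary: str,
    history: List[Tuple[str, str]],
    min_score: float = 0.3,
) -> Optional[Tuple[str, str, float]]:
    """Return (summary, workaround, score) of the closest past incident.

    `history` holds (summary, workaround) pairs from closed incidents.
    Returns None when nothing reaches `min_score`.
    """
    new_tokens = tokens(new_summary)
    best = None
    for summary, workaround in history:
        score = jaccard(new_tokens, tokens(summary))
        if score >= min_score and (best is None or score > best[2]):
            best = (summary, workaround, score)
    return best
```

A production system would more likely match on structured fields such as affected service, category, and root cause in addition to free text.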
3.3 Problem Management
As
defined by ITIL, “a problem is the unknown underlying cause of one or more
Incidents. It will become a Known Error when the root causes are known and a
temporary workaround or a permanent alternative has been identified.”
The
goal of Problem Management is to find the root causes of errors and potential
errors and make sure they are fixed to limit incidents in the future. This goes
a step further than Incident Management where Incident Management only resolves
the single incident whether this includes fixing the root cause or not.
By
limiting incidents it not only allows the business processes to run more
smoothly, increasing customer satisfaction and profit opportunity, but also
reduces workload on the IT organization.
Responsibilities
of Problem Management include:
·
Problem control to identify problems and
diagnose them to find the root cause. This step includes recording problem
details including the related incidents.
·
Error control to monitor and fix
known errors where appropriate based on SLAs, costs to fix, and impact.
·
Proactive Problem Management focusing on
quality of the IT Infrastructure using trend analysis to find potential
problems before incidents occur.
·
Providing information regarding known errors
and workarounds to other ITIL processes.
Supporting
this process, the SPECTRUM suite of products identifies and isolates the root
causes of the problems in the infrastructure, allowing them to become known
errors. The SPECTRUM core analysis engine leverages patented fault isolation
and root cause analysis capabilities to determine problems on an infrastructure
component basis. SPECTRUM’s Service Manager application enables root cause
correlation on a service and customer basis. Further, Service Manager
prioritizes issues based on impact and urgency allowing the root cause issues
to be addressed effectively and efficiently. In cases where human intervention
is required for troubleshooting and finding the root cause, the SPECTRUM suite
provides many troubleshooting and analysis tools to aid this process such as
those in OneClick Console and statistical reports from Report Gateway.
SPECTRUM
products also enable Proactive Problem Management through trend analysis.
Report Gateway identifies abnormal patterns and trends in any statistical data
collected by SPECTRUM, proactively identifying things that are possible
problems for the future. Further, Service Performance Manager (SPM) proactively
monitors traffic to find performance trends that may become problems in the
future. Dynamic thresholding even enables alarms to be generated as trends vary
too far from normal.
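One common way to implement dynamic thresholding is to compare each new sample with a baseline computed from recent history. The sketch below is a generic illustration of that idea, not a description of how SPECTRUM computes its thresholds; the window size, the multiplier k, and the min_sigma floor are assumed parameters.

```python
from statistics import mean, stdev
from typing import List, Sequence


def dynamic_alarms(
    samples: Sequence[float],
    window: int = 5,
    k: float = 3.0,
    min_sigma: float = 1.0,
) -> List[int]:
    """Return indexes of samples that stray too far from normal.

    "Normal" is the mean of the preceding `window` samples; a sample is
    flagged when it differs from that mean by more than k standard
    deviations of the same window.
    """
    flagged = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        # The floor keeps a perfectly flat baseline from alarming on
        # very small changes.
        sigma = max(stdev(baseline), min_sigma)
        if abs(samples[i] - mu) > k * sigma:
            flagged.append(i)
    return flagged
```

Because the baseline moves with the data, a slow seasonal rise does not raise alarms, while a sudden spike against recent history does.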
3.4 Configuration Management
Information
is crucial to making decisions. Without detailed, up-to-date, and accurate
information of the IT infrastructure, poor decisions are made. The goal of the
Configuration Management process is to ensure all information regarding the IT
Infrastructure, its components, and services it supports is detailed,
up-to-date, and accurate. The information includes details about each specific
component (Configuration Item - CI), as well as the relationships between each
CI, and is stored in the Configuration Management Database (CMDB).
Responsibilities
of Configuration Management include:
·
Planning to define the strategy, process, and
objectives of Configuration Management.
·
Identification where the data models are
defined to record the appropriate information on each CI and the relationships
with other CIs. It also includes defining the methods of adding new CIs and
updating existing CIs to ensure the CMDB is always up to date and has the
appropriate information.
·
Control to ensure only authorized changes and
additions are made to the CMDB. This requires appropriate approval and
documentation such as an approved Request for Change in order to make any
changes to the CIs.
·
Status Monitoring to indicate the status
lifecycle of each CI from under development, test, stock, live use, to phased
out.
·
Verification to audit the infrastructure and verify the CI’s existence
and actual information to ensure the CMDB is correct.
·
Reporting to provide
information on the CIs and their relationships to the other ITIL processes.
The SPECTRUM product suite helps with Configuration Management by providing
information on what should be stored in the CMDB and on the relationships between
components, customers, and owners, and by auditing the IT Infrastructure to
ensure the CMDB is accurate and up to date.
SPECTRUM Report Manager identifies information required for CIs based on
SPECTRUM’s in-depth knowledge of the infrastructure. This information includes
things such as:
·
What IT components are
being used, with asset usage reports
·
What needs to be upgraded, with asset reports
showing current versions of firmware and hardware
·
The status of different
components and their availability, to see which components are contributing to
problems
·
Inventory reports to
help verify the CMDB
In
addition to Report Manager, Service Manager manages the relationships between the
IT components allowing them to be effectively managed. It provides information
such as what will be affected by a change and what needs to be considered. This
includes determining maintenance times based on the effect maintenance of a CI
will have on the business services. It also relates the CIs to services and
customers thereby relating them directly to the business. This information may
be used by financial management.
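Determining what will be affected by a change amounts to walking the relationships between CIs. The sketch below assumes the relationships are available as a simple mapping from each CI to the items that depend on it directly; the CI names and the mapping are made up for the example and do not reflect how Service Manager stores its data.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical relationship data: each CI maps to the CIs and services
# that depend on it directly.
DEPENDENTS: Dict[str, List[str]] = {
    "router-1": ["switch-a", "switch-b"],
    "switch-a": ["mail-server"],
    "switch-b": ["web-server"],
    "mail-server": ["Email service"],
    "web-server": ["Customer portal"],
}


def impacted_by(ci: str, dependents: Dict[str, List[str]]) -> Set[str]:
    """Breadth-first walk collecting everything downstream of `ci`."""
    seen: Set[str] = set()
    queue = deque([ci])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen
```

Calling impacted_by("switch-a", DEPENDENTS) returns the mail server and the Email service, which is the kind of answer needed when choosing a maintenance window for that switch.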
Finally,
SPECTRUM Configuration Manager helps configuration management by verifying
configurations of components to be sure they have not been changed without
updating the CMDB, helping to keep the CMDB in sync with the actual
environment.
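The verification idea can be sketched as a comparison between what the CMDB records and what is actually discovered in the environment. The function below is a minimal illustration under the assumption that both sides can be exported as dictionaries keyed by CI name; the data layout and the find_drift name are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Config = Dict[str, Any]


def find_drift(
    cmdb: Dict[str, Config],
    discovered: Dict[str, Config],
) -> List[Tuple[str, str, Any, Any]]:
    """Return (ci, attribute, recorded, actual) for every mismatch."""
    drift = []
    for ci in sorted(set(cmdb) | set(discovered)):
        recorded = cmdb.get(ci)
        actual = discovered.get(ci)
        if recorded is None or actual is None:
            # The CI exists on one side only.
            drift.append((ci, "<existence>", recorded is not None, actual is not None))
            continue
        for attr in sorted(set(recorded) | set(actual)):
            if recorded.get(attr) != actual.get(attr):
                drift.append((ci, attr, recorded.get(attr), actual.get(attr)))
    return drift
```

Each reported mismatch is either an unauthorized change in the environment or a CMDB record that was never updated, and both cases call for follow-up through the Control activity.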
3.5 Change Management
“Not
every change is an improvement, but every improvement is a change” - Motto of
Change Management from ITIL. A change is anything from a minor installation to
a major reconfiguration and service rollout. While some changes can be planned
well in advance, others are required quickly to fix an incident or a major
problem. Planned or unplanned, without a good change management process,
changes are often the cause of many incidents, thereby consuming unnecessary
resources for troubleshooting and correction. The goal of change management is
to manage the change process so changes can be done quickly and with little
impact on the services thereby limiting the incidents resulting from the
change.
The
responsibilities of Change Management include:
·
Recording all Requests
for Change (RFCs). An RFC is submitted for every non-standard change. A well
known and clearly defined modification, such as a password reset, can be
handled with a Service Request to the Service Desk.
·
Accepting RFCs to ensure all the required
information to make decisions is included. If the required information is not
included, the RFC is rejected and must be resubmitted.
·
Classification of the RFCs after they are
accepted, where a priority and category (minor, substantial, or major impact) are
assigned.
·
Planning of changes to ensure they are implemented in the required
timeframe with the proper approvals and understanding the true impact and
required resources.
·
Coordination of the change with the appropriate IT
personnel to make the change. Making the change includes building, testing, and
implementing the change.
·
Evaluation of the change after it is
implemented to determine if it met the required objective, if users are
satisfied, if there were side effects of the change, and if costs and resources
were within budget.
·
Implementing urgent changes where there is not
time to follow the full process. In this case, the evaluation and other steps
must be completed following the change in order to ensure traceability of the
change.
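The acceptance and classification steps in the list above can be sketched as a simple validation routine. The required fields and the cut-offs between minor, substantial, and major are assumptions made for the example; an organization would define its own in its change management policy.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed set of fields an RFC must carry before it can be accepted.
REQUIRED_FIELDS = ("requester", "description", "reason", "affected_cis")


@dataclass
class RFC:
    requester: str = ""
    description: str = ""
    reason: str = ""
    affected_cis: Tuple[str, ...] = ()
    category: Optional[str] = None    # "minor", "substantial", or "major"
    priority: Optional[int] = None


def missing_fields(rfc: RFC) -> List[str]:
    """Names of required fields that are empty."""
    return [name for name in REQUIRED_FIELDS if not getattr(rfc, name)]


def accept(rfc: RFC) -> bool:
    """An RFC is accepted only when every required field is filled in."""
    return not missing_fields(rfc)


def classify(rfc: RFC, priority: int) -> RFC:
    """Assign a priority and a category to an accepted RFC.

    The category is derived from the number of affected CIs, using
    assumed cut-offs of 1 and 5.
    """
    if not accept(rfc):
        raise ValueError(f"RFC rejected, missing: {missing_fields(rfc)}")
    count = len(rfc.affected_cis)
    rfc.category = "minor" if count <= 1 else "substantial" if count <= 5 else "major"
    rfc.priority = priority
    return rfc
```

Rejecting incomplete RFCs up front mirrors the rule that a rejected RFC must be resubmitted with the missing information.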
The SPECTRUM product suite contributes to the evaluation of changes. In particular,
SPM and SRG (as well as core SPECTRUM) help IT understand the impact of any
changes on the rest of the infrastructure. This is evaluated by comparing
trends before and after the change. Further, they help identify whether the change is
successful, particularly in cases where a change is made to correct a
performance issue.
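A very simple form of before-and-after comparison is the relative change in the average of a metric. The sketch below is a generic calculation, offered as an assumption about what such a comparison might look like rather than as the method the products use; the function name is hypothetical.

```python
from statistics import mean
from typing import Sequence


def change_effect(before: Sequence[float], after: Sequence[float]) -> float:
    """Relative change in the mean of a metric across a change.

    For example, -0.25 means the average dropped by 25 percent.
    """
    base = mean(before)
    if base == 0:
        raise ValueError("baseline mean is zero; relative change is undefined")
    return (mean(after) - base) / base
```

For a change meant to fix slow response times, a clearly negative result on response-time samples would support marking the change as successful during evaluation.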
In
addition to products supporting Change Management, we provide different product
release mechanisms to fit within the change management process. Often,
changes are required to fix a small problem and need to be done quickly. In
this case, we provide hot-fixes. Also, service packs and full releases are
provided to allow bundled changes to the software to make many changes at once
in a controlled environment.
3.6 Release Management
The
goal of release management is to have a process to be sure all technical and
non-technical aspects of a release are covered when planning for a release. These
processes are intended to ensure a reliable production environment to deliver
high quality services. Though it is related to Change Management, it is
concerned with the implementation of the changes through releases.
Responsibilities
of Release Management include:
·
Release policy and planning including details
from figuring out all the dependencies, coordination of the release with
schedules and communication plans, determining the testing plans, and other
details related to the release.
·
Design, building, and configuration of the
release. This includes testing the release in a lab environment and creating
full documentation on how to recreate the configuration to help ensure a
successful implementation in production. This activity also includes creating a
back-out plan in case there are problems during implementation that require the
release to be aborted and to restore the previous environment.
·
Testing and release acceptance includes
functional testing by a subsection of the users and operational testing by IT.
·
Implementation planning augments the release
planning by including all the details for task schedules and required
resources, all items that must be changed, communicating to all affected people,
etc. Releases are either done in one full release or in stages.
·
Communication, preparation and training for
all parties who communicate with customers and users to ensure they can
communicate the right information. Further, changes to all SLAs, OLAs, and
Underpinning Contracts must be communicated as early as possible prior
to the release.
·
Release distribution and installation includes
the actual rollout. Rollout includes purchasing the appropriate equipment, implementing
it, ensuring the Configuration Management DB is updated, and verifying a
successful implementation.
Although
we do not provide specific “release management” software, the SPECTRUM products
and processes enable release management with regard to the management system.
Lab licenses are available at a reasonable cost to enable our customers to
acquire a duplicate of their production environment for their lab, enabling
complete SPECTRUM release testing for upgrading to new versions, rolling out new
modules, or making changes to configurations. We also provide bundled service
packs that are fully tested for migration. Further, add-on products are
included within service pack releases allowing customers to roll out new
products without a major release of SPECTRUM.
We
also provide a best practices guide for installation and upgrades of SPECTRUM
products to help with release planning and implementation. Finally, we have
focused development on providing simplified installation and administration for
maintaining customizations during major and/or service pack upgrades, allowing
distributed installation, and producing the OneClick architecture with single
point of administration.
3.7 Service Level Management
The Service Level Management process is more involved than many IT people think,
going beyond simply defining a technical service and monitoring that service.
It includes everything from negotiating with the customers, defining the
service with regard to the customer requirements, managing and measuring the
service and improving quality over time; all within an acceptable cost
structure. Service Level Management creates a relationship between the IT
Service Organization (provider) and the customer allowing them to agree upon
the level of service that will be delivered in order to meet the business and
financial needs. It also allows a mapping between the technical requirements to
deliver the service (for the IT organization) and common business language
describing the service and the guarantees (for the customer).
Responsibilities
of Service Level Management include:
·
Developing a Service Catalog to provide
details of services offered, written in language that can be easily understood
by customers.
·
Identifying customer needs, which includes understanding their
business processes and requirements to enable creation of the proper services
and SLAs.
·
Defining services to meet the needs identified
with the customer. The services are documented in the Service Level Requirements,
in a non-technical language that can be understood by the customer.
Specification Sheets including both the detailed customer requirements and how
this will impact the IT Organization are also developed.
·
Creating contracts required for each service:
o
Service Level Agreements between the provider and customer.
o
Operational Level Agreements between internal organizations required to deliver
the service levels.
o
Underpinning Contracts between the provider and any external suppliers required
to deliver the service at the guaranteed levels.
·
Implementing, monitoring, and reporting on the
services on a regular basis.
·
Creating Service Improvement Plans (SIPs) to
continuously improve IT Services helping both the provider and the customer
remain competitive.
The
SPECTRUM suite of products plays a role in many aspects of the Service Level
Management process. The product suite is focused on the technical side of
Service Level Management after the SLAs have been negotiated and defined
between the provider and the customer. Once defined and the technical
requirements have been ironed out, SPECTRUM, SPM, iAgent, and Service Manager
are configured to monitor and manage the services and agreements (SLAs, OLAs,
and UCs). Service Manager's templates simplify this process by enabling
providers to align the service templates with their service catalog and SLA
templates with their typical SLAs allowing quick implementation of service and
SLA monitoring.
Service
monitoring and management is accomplished through SPECTRUM's Service
Correlation and Fault Management, enabling providers to quickly find the root
cause of incidents and problems, enabling Incident Management to quickly
resolve incidents to maintain agreed upon service levels. SPM and other
SPECTRUM products integrate performance, system, application, and other
traditional management silos to Services and SLAs.
Part of managing and monitoring services includes presenting the results to the
service managers, service desk, customers, and others. The Service Dashboard
provides real time status of services and SLAs. Service Manager and SPECTRUM
Report Manager provide the historical reports on the services and SLAs. These
reports are required for the Service Quality Plan to show how the service is
meeting the SLA on a regular basis. All reports may be scheduled on a regular
basis and/or done ad hoc. Further, the service and SLA reports provide valuable
input to the SIP by indicating how the services have been performing and any
details regarding incidents down to the root cause. This allows the service
manager to easily identify areas for improvement and create SIPs.
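A report showing how a service is meeting its SLA usually rests on an availability calculation. The sketch below assumes availability is defined as uptime divided by the reporting period and that outages are supplied as start and end offsets in minutes; both the definition and the function names are assumptions for illustration, since real SLAs often exclude planned maintenance or weight outages by business hours.

```python
from typing import Iterable, Tuple


def availability(period_minutes: float, outages: Iterable[Tuple[float, float]]) -> float:
    """Percent of the period during which the service was up.

    `outages` holds (start, end) offsets in minutes. Overlapping
    outages are not merged, so they must not overlap.
    """
    downtime = sum(end - start for start, end in outages)
    return 100.0 * (period_minutes - downtime) / period_minutes


def meets_sla(actual_percent: float, target_percent: float) -> bool:
    """True when the measured availability reaches the agreed target."""
    return actual_percent >= target_percent
```

For a 30-day month (43,200 minutes) with 43.2 minutes of total downtime, the result is 99.9 percent, which would meet a 99.5 percent target but miss a 99.95 percent target.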
Finally,
detailed reports are provided to help with the service reviews with both
customers and internal IT to determine ways to improve the service by showing
not only service availability and performance, but also the root cause of any
incidents. This information can be used by Problem Management to recommend
changes to the infrastructure to better meet the SLAs and provide better
services at a greater profit.
3.8 Financial Management for IT Services
The
goal of Financial Management for IT Services is to enable IT to provide
services cost-effectively through the financial management of IT resources
required to deliver the service. This includes budgeting, accounting, and
charging. Through this process both IT organizations and customers alike become
more aware of the true costs of delivering the services, allowing them to make
better business decisions.
Budgeting
is based on planning for customers' needs and determining the cost associated
with delivery of services to the customers. Accounting consists of tracking the
expenditures of IT. It specifically tracks the cost by customer, service, etc.
Without Budgeting and Accounting, it becomes difficult to make business
decisions that balance cost of services with service quality.
Charging
includes all things required to bill the customer from the objectives to the
actual calculation methods. The advantage of charging, whether to internal
customers or to external customers, based on the quality of the service is that
it improves the provider/customer relationship. It opens up negotiations
regarding what will be provided for what price and allows customers and
providers to make informed business decisions based on cost vs. quality.
Although
SPECTRUM products don't provide budgeting, accounting, or charging tools, there
are many pieces of SPECTRUM such as integrations, SLAs, etc. that indirectly
support financial management through other ITIL processes that we support.
3.9 Capacity Management
The
goal of Capacity Management is to provide adequate capacity to meet the SLAs in
a cost-effective manner. To reach this goal:
·
Avoid overspending on capacity that goes
unused.
·
Avoid guessing on what
will meet the service needs.
·
Plan for upcoming needs enabling better
purchase decisions.
The Capacity Management process is made up of three sub-processes:
· Business Capacity Management takes the customers’ business plans into
account to predict future capacity needs. After understanding the customers’
business needs, providers typically use models and simulations to determine
the capacity required to support new or modified applications.
· Service Capacity Management ensures SLAs can be met based on current and
peak loads on the infrastructure. This is done by monitoring and analyzing the
performance of the monitored services and comparing it to the resource load
determined in Resource Capacity Management.
· Resource Capacity Management builds an understanding of how the IT
infrastructure is used by monitoring trends on the resources needed to provide
the services. This sub-process monitors metrics such as CPU utilization, disk
space, and bandwidth utilization.
In both Service and Resource Capacity Management, trending is used to analyze
the monitored data and predict future trends for capacity planning. Based on
the trend information, these processes are also responsible for tuning the
infrastructure to best meet the capacity needs. Any required changes are made
through the Change Management process.
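The trending described above can be illustrated with a minimal sketch. It fits
a least-squares line to utilization samples and estimates when a threshold
will be reached. The function names, the sample data, and the 90 percent
threshold are assumptions made for illustration; they are not defined by ITIL
or taken from any SPECTRUM product.

```python
from typing import Optional, Sequence, Tuple


def fit_linear_trend(samples: Sequence[float]) -> Tuple[float, float]:
    """Fit a least-squares line through samples taken at periods 0, 1, 2, ...

    Returns (slope, intercept).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_threshold(
    samples: Sequence[float], threshold: float
) -> Optional[float]:
    """Periods after the last sample until the trend line reaches threshold.

    Returns 0.0 if the trend line is already at or above the threshold,
    and None if the trend is flat or falling.
    """
    slope, intercept = fit_linear_trend(samples)
    last_x = len(samples) - 1
    current = slope * last_x + intercept
    if current >= threshold:
        return 0.0
    if slope <= 0:
        return None
    return (threshold - intercept) / slope - last_x


# Hypothetical monthly disk utilization, in percent.
disk_utilization = [50.0, 55.0, 60.0, 65.0, 70.0]
print(periods_until_threshold(disk_utilization, 90.0))
```

A straight line is the simplest possible trend model. Real capacity data is
often seasonal or bursty, so a production forecast would likely need a richer
model than this sketch.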
We address the Service and Resource Capacity Management sub-processes.
SPECTRUM, Service Manager, iAgent, and SPM are used to monitor and manage the
infrastructure to ensure the agreed-upon service levels are met. Root Cause
Analysis through advanced correlation allows the provider to focus on the real
problems when they occur, while understanding service and customer impact.
Service Manager, Report Manager, and Report Gateway all play a role in
analyzing data to predict future utilization and in highlighting capacity
available for other use. First, Service Manager is used to indicate the impact
of putting an infrastructure device in maintenance mode, showing what services
and SLAs are impacted by a single device or group of devices (ITIL defines
this as Component Failure Impact Analysis (CFIA)). SPECTRUM Report Gateway
provides statistical trending reports for performance, allowing performance
analysis to see where capacity is available or where more is needed. This
includes data from SPECTRUM, SPM, and iAgent for systems and applications.
Finally, SPECTRUM Report Manager provides asset reporting that gives an
understanding of the capacity of devices in the infrastructure, highlighting
unused capacity and showing when available capacity is getting low.
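The impact analysis described above (which services are affected when a device
or group of devices is taken out of service) can be sketched as a lookup over
a dependency map. The service names, device names, and function name below are
hypothetical. The sketch also assumes that every listed device is a single
point of failure for its service, which ignores redundancy.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical mapping of each service to the devices it depends on.
SERVICE_DEPENDENCIES: Dict[str, Set[str]] = {
    "email": {"core-router", "mail-server"},
    "web-store": {"core-router", "web-server", "db-server"},
    "payroll": {"db-server"},
}


def impacted_services(
    devices: Iterable[str],
    dependencies: Dict[str, Set[str]],
) -> List[str]:
    """Return the services that depend on at least one of the given devices."""
    down = set(devices)
    return sorted(
        service
        for service, needed in dependencies.items()
        if needed & down  # non-empty intersection means the service is hit
    )


print(impacted_services(["db-server"], SERVICE_DEPENDENCIES))
print(impacted_services(["mail-server", "web-server"], SERVICE_DEPENDENCIES))
```

Running the same lookup for each candidate device before scheduling
maintenance gives a simple view of which services and SLAs are at risk.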
With respect to tuning the infrastructure for optimization based on trend
data, we enable capacity management on the SPECTRUM products. For example,
OneClick allows monitoring of license use for all new products, showing
current, average, and peak usage. Also, the Proof of Concept process allows
customers to meet the Capacity Planning requirement of testing the system in
their own lab environment before implementation.
3.10 IT Service Continuity Management
The
main goal of IT Service Continuity Management is to support Business Continuity
Management by ensuring timely restoration of IT infrastructure and services
following a disaster.
ITIL defines a disaster as: “An event that affects a service or system such
that significant effort is required to restore the original performance
level.” Examples of disasters include fires, floods, burglary, and large-scale
power outages. By nature, a disaster is more serious than an incident.
Responsibilities of IT Service Continuity Management include:
· Determining the scope of IT Service Continuity Management based on business
requirements.
· Performing Business Impact Analysis to determine the impact of a disaster
and how long the business can survive without any particular IT Service.
· Performing Risk Assessment to determine exposure based on assets, threats,
and vulnerabilities.
· Developing the IT Service Continuity Strategy based on a combination of
preventative measures and recovery options.
· Implementing the plan, including the preventative measures and recovery
options, testing, training, reviewing and auditing, and assurance.
The SPECTRUM product suite helps with IT Service Continuity Management in
multiple ways. SPECTRUM and Service Manager are capable of detecting IT
disasters, providing root cause analysis, and showing the impact of the
disaster. This allows the IT Service Continuity team to initiate the proper
portions of the Disaster Recovery Plan in a timely manner.
Further, SPECTRUM products themselves are designed to work well within a
high-availability IT environment, allowing timely recovery from disasters that
affect the infrastructure management system. Fault Tolerant licenses are
available for all products and are included in both Integrity and Infinity
products, allowing infrastructure management to be assumed by the fault
tolerant server (which could be at a separate physical location) in the event
a disaster impacts the primary server. Also, distributed configurations spread
the workload so that any one system impacted by a disaster does not have as
great an impact on management of the entire infrastructure.
3.11 Availability Management
The main goal of Availability Management is to cost-effectively provide the
level of availability of IT Services required for successful business
operation, typically as defined in the SLA. Availability consists of service
uptime, degraded time, time between failures, service recovery time, and other
metrics related to a user’s ability to effectively use the service. Effective
Availability Management requires that the business aspects of the service and
its requirements be well understood. Also, up-front planning must take place
to ensure the service will be capable of providing high availability, and the
service must be monitored to ensure availability agreements are met.
Availability Planning includes:
§ Determining availability requirements as the first step prior to defining
SLAs. This must take the business requirements of the customer into account
and be done for any new service as well as any service changes. It includes
quantifiable requirements and impact.
§ Designing for availability and recoverability using the risk assessment
methods from the IT Service Continuity Management process as well as the
Component Failure Impact Analysis used in the Capacity Management process.
These are used to ensure the requirements can be met at an acceptable cost. If
they cannot, it must be determined whether the design can be modified to meet
them or whether the requirements need to be renegotiated with the customer.
This responsibility also includes designing the monitoring and management
system, as monitoring the components of a service and restoring the service
are an important part of maintainability and recovery that must be considered
in the design phase.
§ Security issues such as access to information need to be planned up front,
as poor security planning can affect the availability of a service.
§ Maintenance Management is taken into consideration, as it is required for
continued high availability of a service through upgrades and changes.
Maintenance windows are typically scheduled at times when there will be the
least impact to the service. These maintenance windows must be respected by
the provider.
§ Developing the Availability Plan, which is a living document defining the
current requirements and methods as well as improvement and maintenance
guidelines for the future.
Monitoring includes measuring and reporting service availability. Reporting
provides the baseline information required to validate the service agreements,
as well as to solve problems and identify improvement opportunities.
Availability monitoring for services includes uptime, degraded time, Mean Time
Between Failures (MTBF), Mean Time To Repair (MTTR), and Mean Time Between
System Incidents (MTBSI). MTBF and MTBSI are similar. The difference is that
MTBF is the time between the end of one failure and the start of the next,
whereas MTBSI is the time between the start of one failure and the start of
the next. Dividing MTBF by MTBSI indicates whether there are a large number of
minor failures or a small number of major failures.
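The definitions above can be made concrete with a short sketch that computes
the three metrics from a list of incidents. The Incident record, the function
names, and the sample log are assumptions made for illustration. Times are
expressed in hours from the start of the measurement period, and incidents are
assumed to be sorted by start time and not to overlap.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Incident:
    start: float  # hours since the start of the measurement period
    end: float    # hours since the start of the measurement period


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def availability_metrics(incidents: List[Incident]) -> Dict[str, float]:
    """Compute MTTR, MTBF, MTBSI, and the MTBF/MTBSI ratio."""
    if len(incidents) < 2:
        raise ValueError("need at least two incidents")
    # Consecutive (previous, next) incident pairs.
    pairs = list(zip(incidents, incidents[1:]))
    # MTTR: average duration of a failure.
    mttr = mean([i.end - i.start for i in incidents])
    # MTBF: end of one failure to the start of the next.
    mtbf = mean([nxt.start - prev.end for prev, nxt in pairs])
    # MTBSI: start of one failure to the start of the next.
    mtbsi = mean([nxt.start - prev.start for prev, nxt in pairs])
    return {"MTTR": mttr, "MTBF": mtbf, "MTBSI": mtbsi, "ratio": mtbf / mtbsi}


# Hypothetical incident log, in hours.
log = [Incident(100, 102), Incident(300, 304), Incident(700, 703)]
print(availability_metrics(log))
```

With these definitions, the MTBF/MTBSI ratio is the share of each failure
cycle during which the service was up, so a value close to 1 means that
downtime is small relative to the interval between incidents.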
The SPECTRUM product suite contributes to the Availability Management process
in many ways. First, Service Manager measures and reports on service
availability for the Availability Management process. Service Manager also
isolates and presents the root cause of service failures, significantly
reducing the amount of unavailability by improving the ability to respond
quickly to faults. Service Reports provide the data about faults from previous
incident and problem records that Availability Management requires, allowing
availability improvements. This data is provided as summary information as
well as detailed information about every fault contributing to service
availability problems. Also, to improve availability, service reports are
generated showing every infrastructure item that contributes to a service
outage, along with statistics on its contribution such as number of outages
caused and number of services affected.
SPECTRUM's maintenance mode, combined with Service Manager's ability to
incorporate planned maintenance windows within SLAs, supports Availability
Management's requirements to schedule and respect planned downtime. Further,
Service Manager shows the services and customers impacted by maintaining a
specific device or group of devices, in order to plan the best maintenance
windows with the least amount of impact. In addition, with respect to
availability of the Management System itself, SPECTRUM is easily deployed in a
fault tolerant mode for a high-availability management service with fast
recovery times.
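To illustrate how planned maintenance windows can be respected when measuring
availability against an SLA, the sketch below uses a commonly cited
availability ratio: agreed service time minus downtime, divided by agreed
service time. Treating maintenance windows as outside the agreed service time
is an assumption of this sketch, not a description of how Service Manager
calculates SLAs. All names and figures are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in hours


def overlap(a: Interval, b: Interval) -> float:
    """Length of the overlap between two intervals (0 if they do not overlap)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def sla_availability(
    period: Interval,
    outages: List[Interval],
    maintenance: List[Interval],
) -> float:
    """Availability as a fraction, excusing downtime inside maintenance windows.

    Assumes outages do not overlap one another, maintenance windows do not
    overlap one another, and all intervals lie within the period.
    """
    total = period[1] - period[0]
    planned = sum(end - start for start, end in maintenance)
    agreed_service_time = total - planned
    if agreed_service_time <= 0:
        raise ValueError("no agreed service time left in the period")
    unplanned_down = 0.0
    for outage in outages:
        excused = sum(overlap(outage, window) for window in maintenance)
        unplanned_down += (outage[1] - outage[0]) - excused
    return (agreed_service_time - unplanned_down) / agreed_service_time


# Hypothetical 30-day month (720 hours) with one 4-hour maintenance window.
month = (0.0, 720.0)
windows = [(100.0, 104.0)]
outages = [(102.0, 106.0), (500.0, 503.0)]
print(sla_availability(month, outages, windows))
```

In this example the first outage starts inside the maintenance window, so only
the two hours that run past the end of the window count against the SLA.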
3.12 Security Management
The goals of Security Management are to meet all external security
requirements, including those in SLAs, contracts, and legislation, and to
maintain a base level of security in IT. The Security Management process
includes:
· Confidentiality to ensure only authorized users can access the information.
· Integrity to ensure the information is correct.
· Availability to allow access to information when it is required, as defined
in the SLA.
The Security Management process is responsible for setting, implementing, and
maintaining the security policy. It is a continuous cycle of planning,
implementing, and evaluating, then planning again based on the maintenance
requirements gathered from the evaluation. The process is cyclical because the
IT and security landscape is always changing.
Although SPECTRUM products do not directly support the Security Management
process, these products fit well within a secure infrastructure. Security is
built in as part of the SPECTRUM OneClick UI architecture, both by requiring
username and password authentication and by providing SSL-encrypted
connections from the client to the web server. In addition, the web server is
the only client to the SPECTRUM Assurance Server, allowing a trusted
connection between the two and allowing them to be located inside the same
trusted domain.
OneClick Console with all integrated UI components, Service Dashboard, and
Report Manager offer the ability to set up user privileges defining what
specific users or groups of users can and cannot see or do. This supports
confidentiality as well as integrity by removing the danger of unauthorized
users seeing information they should not have access to or making changes that
could affect the management of the infrastructure. For example, the
administrator may have privileges to edit services and SLAs, schedule reports,
and configure response time tests, while some users may not have privileges to
do any of these and others may have privileges to do a subset of them.
Further, the administrator can define information access permissions including
access to managed devices, specific reports, or even views within the user
interface. Some users or customers may only have a view of the topology, while
others may be able to see alarms as well.
Finally, SPECTRUM
takes advantage of SNMPv3 to provide secure management of any environment using
SNMPv3 on the infrastructure devices.
4 Summary
ITIL is quickly becoming popular with IT organizations as they look to make IT
Service Management more effective in today’s business world, which is heavily
reliant upon IT Services. Implementing ITIL helps ensure the appropriate
service levels are delivered at an acceptable cost through proper business
planning, relationship management, and IT management.
We
are dedicated to supporting ITIL efforts in your organization. As outlined in
this paper, the SPECTRUM product suite and business processes significantly
contribute to your successful implementation of the ITIL best practices for IT
Service Management.